Introduction

Use the Lip Sync API to generate a new video where the speaker’s mouth movement is synchronized to a target audio track. Submit a source video URL and an audio URL, then poll the task status until the final result is ready.

Key Features

Async Processing

Submit once and retrieve results when processing completes

Simple Input

Create tasks with just a video URL and an audio URL

Flexible Playback

Choose how to handle cases where video is shorter than audio

Result Metadata

Get output video URL, cover image, duration, and error details

Workflow Overview

Lip sync video generation is an asynchronous 3-step process:
1. Submit Task: Call the create endpoint with your source video and source audio URLs
2. Processing: JoggAI performs lip sync generation in the background
3. Retrieve Result: Poll the task endpoint until status becomes success or failed

Lip sync generation is asynchronous. After submitting the task, store the returned task_id and use it to poll for progress and results.

Quick Start

| Endpoint | Purpose | Documentation |
| --- | --- | --- |
| POST /open/v2/create_lip_sync_video | Submit lip sync task | API Reference |
| GET /open/v2/lip_sync_video/{task_id} | Check lip sync task status | API Reference |

Key Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| video_url | string | Yes | Publicly accessible source video URL |
| audio_url | string | Yes | Publicly accessible source audio URL |
| playback_type | string | No | Playback behavior when source video is shorter than audio |

Playback Type Values

| Value | Description |
| --- | --- |
| normal | Play the source video normally |
| normal_reverse | Alternate forward and reverse playback to extend the video |
| normal_reverse_by_audio | Extend playback dynamically based on source audio duration |
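
The parameters and playback values above can be checked on the client before a task is submitted. The following is a minimal sketch, not part of the JoggAI API: `build_lip_sync_payload` and `PLAYBACK_TYPES` are hypothetical names, and the sketch assumes `playback_type` may be omitted from the request body. If the API requires it, always pass a value.

```python
from typing import Optional

# Playback types listed in the table above.
PLAYBACK_TYPES = ("normal", "normal_reverse", "normal_reverse_by_audio")


def build_lip_sync_payload(video_url: str, audio_url: str,
                           playback_type: Optional[str] = None) -> dict:
    """Build the JSON body for create_lip_sync_video (hypothetical helper)."""
    if not video_url or not audio_url:
        raise ValueError("video_url and audio_url are both needed")
    payload = {"video_url": video_url, "audio_url": audio_url}
    # Assumption: playback_type is optional, so it is only sent when given.
    if playback_type is not None:
        if playback_type not in PLAYBACK_TYPES:
            raise ValueError(f"unknown playback_type: {playback_type!r}")
        payload["playback_type"] = playback_type
    return payload
```

Rejecting an unknown value locally avoids spending a request on a task that would presumably be refused by the server.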

Pricing

| Rule | Value |
| --- | --- |
| Credits per duration | 1 credit per 125 seconds |
| Approximate cost per second | 0.008 credits/second |
| Display rule | Round up and keep 2 decimal places |
Estimated usage can be calculated from output duration. Since 1 / 125 = 0.008, the displayed unit cost is approximately 0.01 credits per second when rounded up to 2 decimal places.
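
As a rough illustration of the pricing rule, the sketch below estimates credits from an output duration. `estimate_credits` is a hypothetical helper, and it assumes the round-up rule is applied to the total estimate. Actual billing may be computed differently, so treat the result as an approximation.

```python
from decimal import Decimal, ROUND_CEILING

# From the pricing table: 1 credit per 125 seconds.
SECONDS_PER_CREDIT = Decimal(125)


def estimate_credits(duration_seconds: float) -> Decimal:
    """Estimate credits for an output duration, rounded up to 2 decimal places."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Convert through str so a float such as 12.5 becomes an exact Decimal.
    raw = Decimal(str(duration_seconds)) / SECONDS_PER_CREDIT
    return raw.quantize(Decimal("0.01"), rounding=ROUND_CEILING)
```

For example, a 12.5-second output gives 12.5 / 125 = 0.1, shown as 0.10 credits, and a 1-second output gives 0.008, which rounds up to 0.01.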
The video_url and audio_url must be publicly accessible direct URLs. Temporary or authenticated links may fail during processing.

Code Examples

Step 1: Submit Lip Sync Task

curl --request POST \
  --url 'https://api.jogg.ai/open/v2/create_lip_sync_video' \
  --header 'x-api-key: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "video_url": "https://res.jogg.ai/source-video.mp4",
    "audio_url": "https://res.jogg.ai/source-audio.wav",
    "playback_type": "normal"
  }'
Response:
{
  "code": 0,
  "msg": "Success",
  "data": {
    "task_id": "3d5c6930-d0da-4f7b-826e-cd1530f6734f",
    "status": "pending"
  }
}
Save the task_id from the response. You will need it to poll the lip sync task result.
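
The same request can be sent from Python using only the standard library. This is a sketch under stated assumptions: `build_create_request` and `submit_lip_sync_task` are hypothetical helpers, the 30-second timeout is arbitrary, and treating any non-zero `code` as an error is an assumption based on the success example above.

```python
import json
import urllib.request

CREATE_URL = "https://api.jogg.ai/open/v2/create_lip_sync_video"


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare the POST request without sending it (hypothetical helper)."""
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def submit_lip_sync_task(api_key: str, payload: dict, timeout: float = 30.0) -> str:
    """Send the request and return the task_id from the response."""
    request = build_create_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumption: code 0 means success, as in the example response.
    if body.get("code") != 0:
        raise RuntimeError(f"create failed: {body.get('msg')}")
    return body["data"]["task_id"]
```

Keeping request construction separate from sending makes the headers and body easy to inspect without network access.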

Step 2: Check Task Status

curl --request GET \
  --url 'https://api.jogg.ai/open/v2/lip_sync_video/3d5c6930-d0da-4f7b-826e-cd1530f6734f' \
  --header 'x-api-key: YOUR_API_KEY'
Response (Processing):
{
  "code": 0,
  "msg": "Success",
  "data": {
    "task_id": "3d5c6930-d0da-4f7b-826e-cd1530f6734f",
    "status": "processing",
    "created_at": 1741500000,
    "completed_at": null,
    "data": null,
    "error": null
  }
}
Response (Success):
{
  "code": 0,
  "msg": "Success",
  "data": {
    "task_id": "3d5c6930-d0da-4f7b-826e-cd1530f6734f",
    "status": "success",
    "created_at": 1741500000,
    "completed_at": 1741500030,
    "data": {
      "result_url": "https://res.jogg.ai/lipsync-result.mp4",
      "cover_url": "https://res.jogg.ai/lipsync-cover.jpg",
      "duration_seconds": 12.5
    },
    "error": null
  }
}
Response (Failed):
{
  "code": 0,
  "msg": "Success",
  "data": {
    "task_id": "3d5c6930-d0da-4f7b-826e-cd1530f6734f",
    "status": "failed",
    "created_at": 1741500000,
    "completed_at": 1741500030,
    "data": null,
    "error": {
      "message": "consume retry limit reached: audio duration fetch failed"
    }
  }
}
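
The three response shapes above differ only in which of `data.data` and `data.error` is populated. One way to handle them uniformly is to flatten the response into a single record. `LipSyncTask` and `parse_task_response` are hypothetical names, and the sketch assumes responses always follow the example shapes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LipSyncTask:
    task_id: str
    status: str
    result_url: Optional[str] = None
    cover_url: Optional[str] = None
    duration_seconds: Optional[float] = None
    error_message: Optional[str] = None


def parse_task_response(body: dict) -> LipSyncTask:
    """Flatten a status response shaped like the examples above."""
    task = body["data"]
    result = task.get("data") or {}   # null in the processing and failed examples
    error = task.get("error") or {}   # null in the processing and success examples
    return LipSyncTask(
        task_id=task["task_id"],
        status=task["status"],
        result_url=result.get("result_url"),
        cover_url=result.get("cover_url"),
        duration_seconds=result.get("duration_seconds"),
        error_message=error.get("message"),
    )
```

Note that the failed example still carries a top-level `code` of 0, so the task outcome has to be read from `data.status` rather than from `code`.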

Status Values

| Status | Description | Action |
| --- | --- | --- |
| pending | Task has been accepted and queued | Wait, then poll again |
| processing | Lip sync is currently being generated | Continue polling |
| success | Final video is ready | Download result_url |
| failed | Task could not be completed | Inspect error.message |
Poll every 5-10 seconds for active tasks. In production, avoid overly aggressive polling to reduce unnecessary API traffic.
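
The polling guidance above can be sketched as a loop that stops on a terminal status. `fetch_task` and `poll_until_done` are hypothetical helpers. The 5-second interval follows the recommendation above, while the 600-second overall timeout is an assumption and not a documented limit.

```python
import json
import time
import urllib.request
from typing import Callable

TERMINAL_STATUSES = {"success", "failed"}
STATUS_URL = "https://api.jogg.ai/open/v2/lip_sync_video/{task_id}"


def fetch_task(api_key: str, task_id: str) -> dict:
    """GET the task and return the inner data object."""
    request = urllib.request.Request(
        STATUS_URL.format(task_id=task_id), headers={"x-api-key": api_key}
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)["data"]


def poll_until_done(fetch_status: Callable[[], dict],
                    interval: float = 5.0,
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> dict:
    """Call fetch_status() until the task reaches success or failed."""
    deadline = clock() + timeout
    while True:
        task = fetch_status()
        if task["status"] in TERMINAL_STATUSES:
            return task
        # Give up rather than sleep past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"task still {task['status']} after {timeout} seconds")
        sleep(interval)
```

A typical call would be `poll_until_done(lambda: fetch_task(api_key, task_id))`. The `sleep` and `clock` parameters exist so the loop can be exercised without real waiting.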

Use Case Examples

Replace or localize speech in short-form videos while preserving the speaker’s visual delivery.
Reuse an existing talking-head video with a different audio track for updated messaging or campaigns.
Generate multiple lip sync tasks for different audio tracks and languages using the same base video.

Tips for Best Results

Input quality matters:
  • Use a clear frontal face shot when possible
  • Avoid heavy occlusion around the mouth area
  • Use clean audio with stable volume
  • Prefer publicly accessible CDN URLs over temporary download links
Playback strategy guidance:
  • Use normal when video duration already matches the audio well
  • Use normal_reverse when you want a simple loop-like extension effect
  • Use normal_reverse_by_audio when alignment should adapt to the audio duration
Polling guidance:
  • Start polling a few seconds after task creation
  • Poll every 5-10 seconds during active processing
  • Stop polling once status is success or failed

Troubleshooting

Issue: Task fails because the video or audio file cannot be downloaded.
Solutions:
  • Ensure both URLs are publicly accessible
  • Avoid signed URLs that expire too quickly
  • Verify the files are reachable from outside your network
  • Confirm the linked file is the actual media file, not an HTML preview page
Issue: Status stays processing longer than expected.
Solutions:
  • Continue polling with a moderate interval
  • Retry with smaller or simpler media files if needed
  • Verify source media is stable and downloadable
  • Contact support if the task remains stuck for an extended period
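
For the download-failure case above, a quick local check can catch links that resolve to an HTML preview page rather than the media file. This is only a sketch: `is_probably_media` and `check_media_url` are hypothetical helpers, the accepted content types are an assumption, and some servers do not answer HEAD requests. Because the check runs from your own machine, it does not prove the file is reachable from outside your network.

```python
import urllib.request


def is_probably_media(content_type: str) -> bool:
    """True when a Content-Type header names audio or video rather than a web page."""
    main_type = content_type.split(";")[0].strip().lower()
    # Assumption: generic binary downloads are also treated as media.
    return (main_type.startswith(("video/", "audio/"))
            or main_type == "application/octet-stream")


def check_media_url(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request with no credentials and inspect the Content-Type."""
    request = urllib.request.Request(url, method="HEAD")
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    # for example when a signed URL has expired.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return is_probably_media(resp.headers.get("Content-Type", ""))
```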

Create Lip Sync Video Task

Full request schema for submitting tasks

Get Lip Sync Video Task

Full response schema for task status and result

Check Video Result & Status

General guidance for polling asynchronous video tasks

Webhook Integration

Recommended callback pattern for async workflows